Speaker Recognition Using Hidden Markov Models, Dynamic Time Warping and Vector Quantisation

Authors

  • Kin Yu
  • John Mason
  • John Oglesby
Abstract

This paper evaluates continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition, emphasising the performance of each model structure across incremental amounts of training data. Text-independent (TI) experiments are performed with VQ and CDHMMs, and text-dependent (TD) experiments are performed with DTW, VQ and CDHMMs. We show that for TI speaker recognition, VQ performs better than an equivalent CDHMM with one training version, but is outperformed by the CDHMM when trained with ten training versions. For TD experiments we show that DTW outperforms VQ and CDHMMs when training data is sparse, but that with more data the performance of the three models is indistinguishable. The performance of the TD procedures is consistently superior to TI, which is attributed to subdividing the speaker recognition problem into smaller speaker-word problems. We also show a large variation in performance across the different digits, concluding that digit zero is the best digit for speaker discrimination.
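As a rough illustration of the two distance-based methods named in the abstract, the sketch below scores a test utterance against per-speaker models using average VQ distortion (which ignores frame order) and a basic DTW alignment cost (which uses it). The abstract does not specify the features, distance measure, codebook training or DTW path constraints, so the Euclidean distance, the toy 2-D "feature vectors", the speaker names and all function names here are assumptions made only for the example, not the authors' implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def vq_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword.

    Frame order is ignored, which is why the same score can be used
    for text-independent as well as text-dependent tests.
    """
    return sum(min(euclidean(f, c) for c in codebook) for f in frames) / len(frames)


def dtw_distance(test, template):
    """Accumulated cost of the best monotonic alignment of two sequences.

    Uses the simplest local steps: (i-1, j), (i, j-1) and (i-1, j-1).
    """
    n, m = len(test), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(test[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def identify(score, test, models):
    """Return the speaker whose model gives the lowest score for the test frames."""
    return min(models, key=lambda speaker: score(test, models[speaker]))


# Toy data (hypothetical): two speakers, 2-D "features".
codebooks = {
    "alice": [(0.0, 0.0), (1.0, 1.0)],
    "bob": [(5.0, 5.0), (6.0, 6.0)],
}
test_frames = [(0.1, 0.0), (0.9, 1.1)]
print(identify(vq_distortion, test_frames, codebooks))
```

The same `identify` helper can be called with `dtw_distance` and per-speaker template sequences in place of codebooks. In that case the order of the frames matters, which matches the text-dependent setting described in the abstract.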


Similar resources

Speaker recognition models

This paper evaluates continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition, across incremental amounts of training data. In comparing VQ and CDHMMs for text-independent (TI) speaker recognition, it is shown that VQ performs better than an equivalent CDHMM with one training version, but is outperformed...

Feature Extraction and Classification for Automatic Speaker Recognition System – A Review

Automatic speaker recognition (ASR) has found many applications in industries such as banking, security and forensics, owing to advantages such as easy implementation, greater security and user friendliness. A good recognition rate is a prerequisite for any ASR system, and it can be achieved by making an optimal choice among the available techniques for ASR. In this paper, different techniq...

Speaker Recognition with Small Training Requirements Using a Combination

Vector Quantisation (VQ) has been shown to be robust in speaker recognition systems which require a small amount of training data. However, the conventional VQ-based method only uses distortion measurements and discards the sequence of quantised codewords. In this paper we propose a method which extends the VQ distortion method by combining it with the likelihood of the sequence of VQ indices ag...

Comparison of Vq and Dtw Classifiers for Speaker Verification

An investigation into the relative speaker verification performance of various types of vector quantisation (VQ) and dynamic time warping (DTW) classifiers is presented. The study covers a number of algorithmic issues involved in the above classifiers, and examines the effects of these on the verification accuracy. The experiments are based on the use of a subset from the Brent (telephone quali...

A Proportional Study on Feature Extraction Method in Automatic Speech Recognition System

Automatic speech recognition (ASR) has been the focus of many researchers for several years. The aim of a speech recognition system is for a computer to be able to "hear," "understand," and "act upon" spoken information. The speaker recognition system is viewed as working in analysis, feature extraction, modeling, and testing/matching stages. Speech processing is to convey information about words, speaker ...

Publication date: 1995